Abstract: In this paper, we discuss the use of self-organizing
protocols to improve the reliability of dynamic Peer-to-Peer (P2P)
overlay networks. Two similar approaches are studied, which are 
based on local knowledge of the nodes’ 2nd neighborhood. The 
first scheme is a simple protocol requiring interactions among 
nodes and their direct neighbors. The second scheme adds a 
check on the Edge Clustering Coefficient (ECC), a local measure 
that allows determining edges connecting different clusters in the 
network. The performed simulation assessment evaluates these 
protocols over uniform networks, clustered networks and scale- 
free networks. Different failure modes are considered. Results 
demonstrate the effectiveness of the proposal.

Index Terms: Complex Networks, Self-organization, Peer-to-Peer

I. Introduction 

A significant part of the research in Peer-to-Peer (P2P) 
systems of the last years has been in the design of overlay 
networks. Overlay networks operate at the application layer, 
on top of the traditional Internet transport protocols. Each node 
in the overlay is a peer that has a unique “id”. Messages are
routed to a node based on that application level id and through 
the overlay links, rather than on a communication based on IP 
addresses. 

A main outcome of these studies was the introduction of 
P2P structured architectures. In essence, these are architectural 
solutions where links among nodes are created based on the 
contents held by nodes. Distributed Hash Tables (DHTs) are
typical examples of these systems [7], [8], [28].

Conversely, unstructured P2P overlays represent networks 
where links among nodes are established arbitrarily. Peers 
locally manage their connections to build some general desired 
topology and links do not depend on the contents being 
disseminated [17]. They are particularly simple to build and 
manage, with little maintenance costs, yet at the price of a 
non-optimal organization of the overlay. Unstructured overlays 
can be used as a building block in a variety of distributed 
applications, especially when the environment in which the
application runs is highly dynamic. Examples include
system monitoring [46], failure detection [43], messaging,
resource discovery [11], [18], [35], [44], and the management of
flash crowd crises through gossip-based information dissemination [5],
[17]. The use of unstructured overlays enables scalable and
efficient solutions that obviate the need for a structure [35], 
[48], [51]. 

Unstructured P2P systems aim at exploiting randomness to 
disseminate information across a large set of nodes. A key
issue is to keep the overlay connected even in the event of
major disasters, without maintaining any global information 
or requiring any sort of administration. Connections between 
nodes in these systems are highly dynamic. 

This work focuses on a decentralized self-healing algorithm
that aims at providing resilience to unstructured overlay
networks. The approach exploits local knowledge that each 
node has about its neighborhood, i.e., nodes that are linked to 
it in the overlay. In particular, each node n maintains and 
actively manages the list of nodes directly connected to it 
(i.e. its neighbors), and the neighbors of its neighbors (the 
so called 2nd neighbors). In a network overlay the failure of 
a neighbor can disrupt, or at least worsen, the communication 
capabilities of a node with the rest of the network. To avoid 
this, the node n reacts to these failures by running a self- 
healing procedure, so as to get back those connections with 
2nd neighbors which were lost. A contention among n and 
its 1st neighbors is performed to replace the lost connection. 
Thus, only one among these nodes creates such a link; this 
way, nodes share the load for the creation and management of 
these novel links [19]. 

Together with this basic self-healing protocol, a variation is
proposed that exploits the notion of Edge Clustering Coefficient
(ECC) [42]. This metric is a local measure that identifies
those edges connecting different clusters. In fact, the ECC
associated with a link is the number of triangles the link belongs to,
relative to the number of triangles that might potentially
include it. The lower the ECC of a link, the fewer the short
alternative paths connecting the two nodes that share that link
(since they are in few common triangles). Since many triangles exist
within clusters, ECC is a measure of how inter-communitarian
a link is: the lower the value, the more likely the link
connects different clusters.

Based on this ECC, a second version of the protocol is
presented, according to which a node n decides to activate the
self-healing procedure with a probability that decreases as
the ECC of the link lost upon a neighbor failure increases.
In other words, the more triangles the link was part of,
the lower the probability of triggering the recovery procedure.
The recovery procedure consists in creating links with the lost
2nd neighbors, as described above. The idea is that a node
can avoid activating the self-healing procedure
for those lost links with higher ECC values. Moreover, with
the aim of preserving the network topology and of limiting
the potential growth in the number of links in the network,
a link removal phase is included in the protocol. Basically,
it removes (with a certain probability) links with higher ECC
values, associated with nodes whose degree exceeds their target
degree.

A simulation assessment is presented that studies the protocols
over uniform networks (where links are created by
randomly choosing nodes as neighbors), clustered networks
and scale-free networks. These network topologies
are exemplars that model different P2P systems. Uniform networks
(with links created as in random graphs) resemble typical
data sharing P2P systems, where peers usually connect to an
almost static (and quite often pre-configured) number of peers
to share data with. This number of neighbors is a trade-off:
too few connections limit the sharing capabilities, while too
many neighbors cause an unbearable communication and
computation overhead for a peer. Clustered networks capture
those situations where there are clusters of nodes that
share several connections, while there are fewer connections
among different clusters. This is a typical situation in social
networks and the like. Scale-free networks are considered
the main network topology that models most real networks
[12], [37]. For example, it has been recognized that the
well-known Gnutella overlay is a scale-free network. Moreover,
there is evidence that the overlay created in Skype has several
hubs (i.e. nodes with a number of connections much higher than
that of the majority of other nodes), suggesting that this type
of network is scale-free [6].

Different types of simulations are considered, with different
types of node removals. The first mode is based on a random
selection of the nodes that fail, in a situation where the number
of failed nodes equals the number of joining nodes. Second,
a “targeted attack” is simulated, meaning that at each step
the “important” nodes with some specific characteristics are
selected to fail. In particular, in uniform and scale-free
networks, the nodes with the highest degrees are selected
to fail. In clustered networks, instead, the selected nodes are
those with the highest number of links connecting different
clusters (the rationale is to increase the probability of
disconnecting the clusters). A variation of the targeted attack
is also considered, where the removed nodes are those with the
highest betweenness centrality. Finally, another mode is
considered where only failures occur.

Results demonstrate that the presented self-healing approaches
preserve network connectivity, coping with node
churn and targeted attacks. Moreover, the use of the ECC can
lower the clustering coefficient of the overlay (depending on
its topology).

The remainder of this paper is organized as follows. Section
II discusses background and related studies available
in the literature. Section III presents the P2P protocol. Section 
IV describes the simulation environment, while Section V 
discusses the obtained results. Finally, Section VI provides 
some concluding remarks. 

II. Related Work 

Several works have been presented in the literature, which
focus on self-organization of P2P systems and their robustness
to failures and node departures. One of the most fascinating
aspects of these distributed approaches is that
peers can execute local strategies in order to maintain some
global properties of the overall network through decentralized
interactions. These global properties are usually referred to as
self-* properties (e.g., self-organization, self-adaptation,
self-management). Peers might interact in order to self-organize
the contents they maintain (e.g., [22], [25]), or even the
connections each peer maintains with other peers (i.e., links in
the overlay). Among all these possibilities, self-healing is
a key characteristic to improve the dependability of the
managed infrastructure. Self-healing is not novel in networks.
It is an interesting approach to cope with the general problem
of providing network resilience [14]. Self-healing ring
topologies were introduced long ago. In the
domain of P2P (and networks), several works concerned with
this issue have been proposed [10], [17], [40].

However, in P2P systems, certain network properties are
usually guaranteed in the steady state. Thus, they may
disappear in case of multiple node departures. For
instance, the overlay might get partitioned upon failure of links 
connecting different clusters. Alternatively, some important 
links might be lost that were playing a main role to keep a low 
network diameter. For instance, in small worlds there are links 
among distant nodes that strongly reduce the average shortest 
path length. Although the P2P network is unstructured, it has 
certain characteristics that should be maintained, at least up 
to a certain extent, in order to provide some guarantees and 
the ability of the network to spread contents. The purpose of 
this work is to understand if some decentralized self-healing 
algorithm can guarantee the resilience and the communication 
capabilities of a P2P system. 

In the literature, some works make a distinction between 
reactive and proactive approaches. In essence, with reactive 
approaches novel links are created only when nodes join, 
leave, or when a failure is detected. This is different from 
proactive approaches, where nodes periodically try to find new
neighbors to link to [41].

Basically, reactive overlay recovery mechanisms may work 
by resorting to either centralized or decentralized approaches 
to identify novel peers. According to a centralized approach, a
peer that “needs neighbors” contacts a set of well-known nodes
that answer with a list of nodes. This approach is exploited
in general P2P systems; for instance, in BitTorrent this role
is played by the tracker. Gnutella also exploits this kind
of strategy. This method is adequate when the P2P overlay
is unstructured or loosely organized; however, its weakness
lies in its dependence on the robustness of these well-known
nodes. If they are reliable nodes in the network (such as public
trackers in BitTorrent, which are in charge of this service only),
the system stays up. Conversely, failures of such nodes may cause
the whole system to partition or crash.

In a reactive decentralized approach, a peer locally asks its 
neighbors to provide information on nodes it is not connected 
to. This method is widely used in structured P2P systems 
[53]. These schemes require information on how to make
connections between independent components when an overlay
partition occurs [41]. 

SCAMP is a prominent example of a reactive recovery
approach [23]. It is a gossip-based protocol where the
neighborhood size of each node adapts to the a-priori unknown size
of the whole system. Thus, each node can modify its set of
neighbors when the system size changes.

Similarly, Phenix is an approach that creates robust topologies
with a low diameter [52]. In particular, it creates scale-free
networks, which are well known to be tolerant to random node
removals. The approach specifically focuses on the
particular case where malicious nodes try to collect information
on the network in order to devise targeted attacks. Such a
scenario is avoided by hiding information from those nodes that
are in local black lists.

As concerns proactive strategies, in the literature seminal 
works have been proposed that build a peer-sampling service. 
Such a service provides nodes with a randomly chosen set 
of neighbors to exchange information with. Typically, this 
information exchange is realized through gossip approaches 
[21]. The set of neighbors creates a dynamic unstructured 
overlay. These approaches mainly differ in the way new nodes’ 
neighbor lists are built, after merging and/or truncating the 
neighbor lists of communicating peers. 

For instance, Cyclon is a popular scheme for constructing
gossip-based unstructured P2P systems that have
low diameter, low clustering, highly symmetric node degrees,
and that are highly resilient to massive node failures [49].
It is a quite inexpensive membership management protocol, where
nodes maintain a small, partial view of the entire network.
According to this protocol, nodes periodically perform a 
shuffling protocol which ensures that peers maintain a list 
of active neighbors. The difference with our scheme is that 
this approach builds a specific and robust overlay, with given 
topology characteristics. Instead, the aim of the approach 
described in this paper is to have a decentralized protocol that, 
given a certain unstructured P2P overlay with any possible 
characteristics, reacts to important failures to avoid further 
network partitioning. 

An approach that is conceptually similar to Cyclon is that 
proposed in [45]. It uses a randomized overlay construction 
method to provide network robustness. 

Newscast is a gossip-based protocol that builds and maintains
a continuously changing random overlay [32]. The generated
topology is built to ensure stability and connectivity.
The idea is that each node modifies periodically its set of 
neighbors by randomly exchanging information with nodes 
it is connected with. Thus, a continuous rewiring strategy is 
performed. 

With respect to this reactive/proactive classification, it is 
worth mentioning that our proposed approach enables nodes 
to react to node disconnections, by creating novel links with 
nodes that have been proactively discovered before the failure. 
Thus, the peer discovery is proactive (and local), while the link 
creation is reactive. A similar philosophy is exploited in [41]. 
Moreover, our proposed approach requires local information
only, hence keeping the amount of information exchanged
in the background quite limited.

Several interesting works look at ways to form “good” 
topologies. One example is [39], which focuses on building 
randomized topologies with bounds on the overlay graph 
diameter. In general, the topology of the overlay has a strong
influence on the performance of information dissemination,
on the workload of nodes and on the overlay robustness. For instance,
if a scale-free network is employed, then the network has 
a low diameter and it is robust to random node failures. 
However, a scale-free net contains a non-negligible fraction 
of peers which maintain a high number of active connections, 
and hence they sustain a workload higher than low-degree 
nodes. Conversely, if a network has a more uniform degree 
distribution, then the workload is equally shared among all 
peers. However, the diameter of the network increases, and so 
does the number of hops needed to cover the whole network 
with a broadcast [15]. Therefore, some approaches in the 
literature force the use of a specific topology. The scheme 
presented in this work has a different goal. It copes with locally
important failures that might partition the overlay, without
significantly altering the original topology of the overlay. Thus,
our scheme aims at augmenting network resilience and it can 
be coupled with other approaches that create some overlay 
with certain features. Indeed, in the performance assessment 
section, the proposed algorithm is evaluated over different 
overlay topologies. 

III. Self-Healing Protocols 
A. System Model 

We consider P2P systems built on top of an unstructured 
overlay network. (Note that in the following the terms “peer” 
and “node” are employed as synonyms.) No assumptions 
are made on the topology of the overlay. In fact, it is not 
the aim of the protocol to build an overlay with specific 
characteristics. Rather, the idea is to provide a simple protocol 
that augments the reliability of an overlay, whatever its starting 
topology, during its evolution with nodes that enter and leave 
the overlay, dynamically. For simplicity, we consider networks 
with undirected links. Actually, this setting is quite common 
in many P2P systems, e.g., BitTorrent, Gnutella [47]. 

Each node n has a certain degree, i.e. the number of its 1st
neighbors or, in other words, of the nodes directly connected
with n in the overlay. The list of n’s 1st neighbors
is denoted with $\Pi_n$, while the degree of n is denoted with
$|\Pi_n|$. n also maintains the list of its 2nd neighbors, $\Pi^2_n$,
i.e. nodes 2 hops away from n. Every time the list $\Pi_n$
changes, due to some node arrival or departure, n informs its
other 1st neighbors of this update. With $\Pi^2_{n|m} = \Pi_m - \Pi_n$, we
identify the 2nd neighbors of n that can be reached through
m. Hence, $\Pi^2_n = \bigcup_{k \in \Pi_n} \Pi^2_{n|k}$. The discussed protocols employ
a threshold on the maximum node degree. Section IV-A
discusses this threshold, and Section V-E
studies its impact.
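
As a minimal sketch of the neighbor lists defined above, the following Python fragment stores the overlay as a dictionary of neighbor sets and derives the 2nd neighborhood from it. The function names, the `overlay` example and its edges are illustrative assumptions; the node n itself is also dropped from its own 2nd neighborhood, which the set difference in the text leaves implicit.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def second_neighbors_via(g: Graph, n: str, k: str) -> Set[str]:
    """2nd neighbors of n that can be reached through its 1st neighbor k."""
    # Neighbors of k that are not already 1st neighbors of n.
    # Assumption: n itself is excluded, since a node is not its own neighbor.
    return g[k] - g[n] - {n}


def second_neighbors(g: Graph, n: str) -> Set[str]:
    """Union of the per-neighbor sets over all 1st neighbors of n."""
    result: Set[str] = set()
    for k in g[n]:
        result |= second_neighbors_via(g, n, k)
    return result


# Hypothetical overlay, loosely modeled on Fig. 1 (the edges are assumed).
overlay: Graph = {
    "n": {"f", "q", "r"},
    "f": {"n", "q", "r", "m", "s"},
    "q": {"n", "f", "m"},
    "r": {"n", "f"},
    "m": {"f", "q"},
    "s": {"f"},
}
```

In this example, `second_neighbors(overlay, "n")` contains m and s, and s is reachable through f only.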

As concerns failures, for the sake of a simpler discussion, 
we assume that only nodes can fail, while it cannot happen that 
single links are removed from the overlay. This is a common 
simplification made in most P2P system models. Anyway, 
the protocol can be easily upgraded (without any substantial 
modifications) to handle single link failures. We assume that 
a failure detection service is employed, that informs a node 
upon a 1st neighbor failure. This service can be implemented 
using some sort of “keep alive” mechanism, such as [30], [53].
Fig. 1. Example of a node failure, as managed at a node n (green node).
Upon failure of a neighbor f, n needs to replace the lost 2-hop connection
with s. Nothing has to be done at n for the other nodes, since q and r are
1st neighbors of n, while m can be reached through q.

Nodes can join and leave the network dynamically. We
assume that network changes (in a given neighborhood) are
slower than a given execution of the communication protocol
[16]. Thus, in general, upon failure of a node f, its neighbor
n is able to send messages to the nodes in $\Pi^2_{n|f}$. Unless
otherwise stated, we consider cases where node arrivals and
departures occur at the same rate.¹

B. Protocol P2n: Use of the 2-Neighborhood

Upon the departure of a neighbor $f \in \Pi_n$, by looking at $\Pi^2_{n|f}$
each node n is able to understand if some 2nd neighbor is no
longer reachable. If this is the case, this protocol ensures that
n, or one of its neighbors, creates a link with it. Algorithms
1-2 sketch the related pseudo-code. In particular, when a node
f fails, $\forall p \in \Pi_f$ there are three possible cases.

1) $p \in \Pi_n$: n and p are neighbors. In this case there is
nothing to do (at n).

2) $p \notin \Pi_n$, but $p \in \Pi^2_n$ since $p \in \Pi^2_{n|q}$ for some
$q \in \Pi_n$, $q \neq f$: p is still a 2nd neighbor of n; also in this
case there is nothing to do.

3) $p \notin \Pi_n$, $p \notin \Pi^2_n$: after the failure p is no longer a
1st or 2nd neighbor of n. In this case, n takes part in
the distributed procedure to create a link with p (see
Algorithm 1).
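
The three cases can be sketched as a small classification function. This is only an illustration under assumptions: the overlay is a dictionary of neighbor sets holding the view n had before the failure, and the function name and the example edges (loosely modeled on Fig. 1) are hypothetical.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def classify_after_failure(g: Graph, n: str, f: str, p: str) -> int:
    """Case (1, 2 or 3) that applies at n for a neighbor p of the failed node f.

    g is the view held before the failure, so g[f] is still available.
    """
    if p in g[n]:
        return 1  # p is a 1st neighbor of n: nothing to do
    if any(p in g[q] for q in g[n] if q != f):
        return 2  # p is still 2 hops away through another neighbor q
    return 3      # p was reachable through f only: a new link is needed


# Hypothetical overlay, loosely modeled on Fig. 1 (the edges are assumed).
overlay: Graph = {
    "n": {"f", "q", "r"},
    "f": {"n", "q", "r", "m", "s"},
    "q": {"n", "f", "m"},
    "r": {"n", "f"},
    "m": {"f", "q"},
    "s": {"f"},
}

# Case for every neighbor of f other than n itself.
cases = {p: classify_after_failure(overlay, "n", "f", p)
         for p in sorted(overlay["f"] - {"n"})}
```

With these assumed edges, only s falls in case 3, which matches the situation described for Fig. 1.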

In essence, links are created among nodes which were
connected through f only. This list is computed by analyzing
the old view n had of its 2nd neighborhood, before removing
its connection information about f and $\Pi^2_{n|f}$ (Algorithm 1,
line 1). Take as an example the situation reported in Figure 1.
In this case, upon failure of f, all dashed links are removed.
Focusing on node n, this node will need to replace its lost
2-hop connection with s, while the other nodes remain 1st
neighbors (nodes q, r) or 2nd neighbors (node m).

¹ These are the scenarios of the so-called “evolution” and “targeted attack”
simulation modes, while in the “failures only” mode the arrival rate is set to 0,
mimicking a worst-case churn scenario.

As already mentioned, each node n keeps a threshold value
for its degree, to prevent its degree from growing out of control
(Algorithm 1, line 3). (This threshold should not be too low,
otherwise it might hinder the creation of additional links, and
this might generate network partitions.) Moreover, in order to
reduce the probability that multiple nodes of the same cluster
attempt to create a novel link with the same node p at the same
time, a classic contention-based approach is used, so that each
node n waits for a random time before transmitting messages
(Algorithm 1, line 4). Such a random waiting time is generated
within a predefined time interval, using a uniform distribution.
This way, each node has the same probability of triggering the
creation of a novel link. This provides load balancing among
nodes.
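
The load-balancing effect of the uniform waiting time can be illustrated with a short simulation. It is a sketch under assumptions: the contenders are three hypothetical nodes, the waiting interval is [0, 1], and the node with the smallest wait is taken to be the one that transmits first.

```python
import random
from typing import Dict, List


def draw_waits(contenders: List[str], max_wait: float,
               rng: random.Random) -> Dict[str, float]:
    """Each contender draws its waiting time uniformly in [0, max_wait]."""
    return {node: rng.uniform(0.0, max_wait) for node in contenders}


def first_to_transmit(waits: Dict[str, float]) -> str:
    """The node whose waiting time expires first wins the contention."""
    return min(waits, key=lambda node: waits[node])


# Repeat the contention many times and count how often each node wins.
rng = random.Random(7)
wins = {"n": 0, "q": 0, "r": 0}
for _ in range(3000):
    winner = first_to_transmit(draw_waits(list(wins), 1.0, rng))
    wins[winner] += 1
```

Over many repetitions the three counters stay close to each other, since every node draws from the same distribution.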

Then, upon reception of a message from a node p asking 
n to become neighbors, n accepts the request only if p is 
not a 1st or 2nd neighbor of n (it is possible that some of its 
neighbors just created a link with p; see Algorithm 2). Then, n 
answers this request through a direct message to p (Algorithm 
2, lines 7, 9). 

Upon creation of a novel link between two nodes, these 
nodes inform all their 1st neighbors that a novel link has been 
created (Algorithm 2, lines 2, 10). 

Finally, when a node n receives a message from a neighbor
(say q) confirming the creation of a link between q and m,
then n can remove m from the list of lost nodes in its 2nd
neighborhood, since after this novel connection n and m are
2 hops away (Algorithm 2, line 5).


Algorithm 1 P2n: Active behavior at n upon failure of f
    ▷ P contains old 2nd neighbors, reachable through f
    only, hence no longer reachable in 2 hops after the failure of f
1: $P \leftarrow \{p \in \Pi^2_{n|f} \mid p \notin \Pi_n,\ p \notin \Pi^2_{n|q}\ \forall q \in \Pi_n, q \neq f\}$
2: update neighbor lists in view of the failure of f
3: while $(P \neq \emptyset) \wedge (|\Pi_n| < thresholdDegree)$ do
4:     wait random time
5:     p ← extract random node from P
6:     send link creation request to p
7: end while
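
The active behavior of Algorithm 1 can be sketched in Python as follows. This is a simplified, single-process illustration and not the simulator used in this work: the random wait is omitted, every link creation request is assumed to be accepted, and the names `heal_after_failure` and `example_overlay` (with its assumed edges) are hypothetical.

```python
import random
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def heal_after_failure(g: Graph, n: str, f: str, threshold_degree: int,
                       rng: random.Random) -> List[str]:
    """Run the steps of Algorithm 1 at n; return the nodes n linked to."""
    # Line 1: lost 2nd neighbors, computed on the view held before the failure.
    lost = {p for p in g[f] - g[n] - {n}
            if not any(p in g[q] for q in g[n] if q != f)}
    # Line 2: drop f and every link incident to it.
    for k in g.pop(f):
        g[k].discard(f)
    # Lines 3-7: create links while the degree threshold allows it.
    created: List[str] = []
    while lost and len(g[n]) < threshold_degree:
        p = rng.choice(sorted(lost))  # extract a random node from P
        lost.remove(p)
        g[n].add(p)                   # assumption: the request is accepted
        g[p].add(n)
        created.append(p)
    return created


def example_overlay() -> Graph:
    """Hypothetical overlay, loosely modeled on Fig. 1 (the edges are assumed)."""
    return {
        "n": {"f", "q", "r"},
        "f": {"n", "q", "r", "m", "s"},
        "q": {"n", "f", "m"},
        "r": {"n", "f"},
        "m": {"f", "q"},
        "s": {"f"},
    }
```

Calling `heal_after_failure(example_overlay(), "n", "f", 100, random.Random(0))` links n to s. With a threshold of 2 no link is created, which illustrates why a threshold that is too low can leave the overlay partitioned.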


Algorithm 2 P2n: Passive behavior at n
Require: message from p answering a link creation request
1: if answer is OK then
2:     sendAll($\Pi_n$, “novel link (n,p)”)
3:     add p to $\Pi_n$
4: end if
Require: message from $q \in \Pi_n$: novel link (q,m), $m \in P$
5: extract m from P
Require: message from p with a link creation request
6: if $p \in \Pi_n \cup \Pi^2_n$ then
7:     send refuse message
8: else
9:     send accept message
10:     sendAll($\Pi_n$, “novel link (n,p)”)
11:     add p to $\Pi_n$
12: end if
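
The passive side can be sketched in the same style. The fragment below is an assumption-laden illustration: message passing is replaced by direct function calls, the notification to the 1st neighbors (lines 2 and 10) is omitted, and both function names are hypothetical.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def handle_link_request(g: Graph, n: str, p: str) -> bool:
    """Lines 6-12: n accepts p only if p is neither a 1st nor a 2nd neighbor."""
    two_hop = set().union(*(g[k] for k in g[n])) - {n}
    if p in g[n] or p in two_hop:
        return False  # refuse: p is already within 2 hops
    g[n].add(p)       # accept: create the undirected link
    g[p].add(n)
    return True


def handle_novel_link(lost: Set[str], m: str) -> None:
    """Line 5: a neighbor linked to m, so m is 2 hops away again."""
    lost.discard(m)


# Hypothetical overlay after the failure of f has been processed.
after_failure: Graph = {
    "n": {"q", "r"},
    "q": {"n", "m"},
    "r": {"n"},
    "m": {"q"},
    "s": set(),
}
```

Here a request from m to n is refused, because m is still reachable through q, while a request from the isolated node s is accepted.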


C. Protocol Pecc: Edge Clustering Coefficient

This protocol is an extension of P2n, and it is based on
the idea of exploiting the importance of failed links, so as to
identify those that, once failed, must be replaced with novel
ones. In complex network theory, several centrality measures
have been introduced to characterize the importance of a node 
or a link in a network, e.g. betweenness centrality, or to 
detect different communities and identify their boundaries in 
the net [4], [26], [27], [38]. The calculation of these metrics 
usually involves a full (or partially full) knowledge about the 
whole network. Conversely, the aim of this work is to preserve 
connectivity without such a global knowledge [19], [20], [34], 
[49]. 

The Edge Clustering Coefficient (ECC) has been defined 
in analogy with the usual node clustering coefficient, but it 
is referred to an edge of the network [42]. It measures the 
number of triangles to which a given edge belongs, divided by 
the number of triangles that might potentially include it, given 
the degrees of the adjacent nodes. More formally, given a link
(n, m) connecting node n with node m, the edge clustering
coefficient $ECC_{n,m}$ is

$$ECC_{n,m} = \frac{T_{n,m}}{\min(|\Pi_n| - 1, |\Pi_m| - 1)},$$

where $T_{n,m}$ is the number of triangles built on the edge
(n, m), and $\min(|\Pi_n| - 1, |\Pi_m| - 1)$ is the number of triangles
that might potentially include it. We add the constraint
that this measure is 0 when there are no possible triangles at
one of the nodes, i.e. when $\min(|\Pi_n| - 1, |\Pi_m| - 1) = 0$.

The idea behind the use of this quantity is that edges
connecting nodes in different communities are included in few
or no triangles, and tend to have small values of $ECC_{n,m}$. On
the other hand, many triangles exist within clusters. Hence the
coefficient $ECC_{n,m}$ is a measure of how inter-communitarian
a link is.
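
A minimal sketch of this definition, assuming the overlay is stored as a dictionary of neighbor sets: the triangles built on an edge are the common neighbors of its two endpoints. The function name and the example graph are illustrative.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def edge_clustering_coefficient(g: Graph, n: str, m: str) -> float:
    """ECC of the edge (n, m), following the definition in the text."""
    triangles = len(g[n] & g[m])                  # common neighbors of n and m
    possible = min(len(g[n]) - 1, len(g[m]) - 1)  # triangles that could exist
    if possible == 0:
        return 0.0  # constraint: no possible triangle at one endpoint
    return triangles / possible


# Hypothetical graph: a triangle (a, b, c) plus a pendant node d attached to a.
example: Graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

The edge (a, b) lies inside the triangle and gets the value 1, while the edge (a, d), which belongs to no triangle, gets 0.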

Thus, based on this notion of $ECC_{n,m}$, the protocol Pecc
works as follows (Algorithm 3 shows the pseudo-code of the
active behavior only, since the passive behavior is equivalent
to Algorithm 2). Each node n knows its 2nd neighbors,
i.e. the 1st neighbors of its neighbors; thus, it can understand if
some triangle exists that includes itself. Indeed, let us say that
three nodes n, m, p form a triangle. Then, n has m and p in its
neighbor list $\Pi_n$ (and the same happens for the two other
nodes). When n sends its list $\Pi_n$ to m and p, they recognize
that there is a common neighbor that creates a triangle. If one
of the three nodes fails in the future, the other two nodes
will understand automatically that the triangle no longer exists.

When a node f fails, each neighbor $n \in \Pi_f$ checks the
value $ECC_{n,f}$. Depending on this value, a reconfiguration
phase may be executed. The idea is that the higher the ECC,
the lower the need to create novel links to keep the network
connected, since that link was part of multiple triangles. This
decision is taken probabilistically, i.e. the lower $ECC_{n,f}$, the
more probable it is that the rest of the procedure is executed (line 3,
Algorithm 3). If this is the case, n checks if its 2nd neighbors
($\Pi^2_{n|f}$), formerly reached through f, still remain in its 2nd
neighborhood; otherwise it creates links with them, as in P2n.
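
The probabilistic decision can be sketched as a comparison between a uniform random draw and the ECC of the lost link. Assuming the draw is uniform in [0, 1), the procedure is then triggered with probability 1 - ECC. The function names below are illustrative.

```python
import random


def should_heal(ecc_nf: float, draw: float) -> bool:
    """Trigger the recovery when the random draw exceeds the ECC of the lost link."""
    return draw > ecc_nf


def heal_frequency(ecc_nf: float, trials: int, rng: random.Random) -> float:
    """Empirical fraction of failures that trigger the recovery procedure."""
    hits = sum(should_heal(ecc_nf, rng.random()) for _ in range(trials))
    return hits / trials
```

For instance, `heal_frequency(0.25, 20000, random.Random(1))` is close to 0.75, while a lost link with ECC equal to 1 never triggers the procedure.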

Due to the overlay reconfiguration, it is expected that the
degree of a node changes (suddenly, in some cases). Indeed,
the goal of the self-healing reconfiguration scheme is that
the network should evolve to react to node arrivals and
departures. For instance, if a hub goes down for some reason,
it is likely that its past neighbors will create more links in order
to keep the overlay connected. Thus, it might happen that
the total number of links increases, due to the parallel activity
of nodes, and this can alter the network topology. In Pecc,
this is more probable when there is low network clustering,
with few triangles.

Algorithm 3 Pecc: Active behavior at n upon failure of f
    ▷ P contains old 2nd neighbors, reachable through f
    only, hence no longer reachable in 2 hops after the failure of f
1: $P \leftarrow \{p \in \Pi^2_{n|f} \mid p \notin \Pi_n,\ p \notin \Pi^2_{n|q}\ \forall q \in \Pi_n, q \neq f\}$
2: update neighbor lists in view of the failure of f
3: if random() > $ECC_{n,f}$ then
4:     while $(P \neq \emptyset) \wedge (|\Pi_n| < thresholdDegree)$ do
5:         wait random time
6:         p ← extract random node from P
7:         send link creation request to p
8:     end while
9: end if
Require: $(|\Pi_n| > |\Pi_n|_{target}) \wedge (L_n > L_{n,target})$
10: Remove at most r links with ECC > $T_{ECC}$

To overcome this possible problem, a periodic check is
performed on the growth of links at each node and in its
neighborhood. Thus, periodically each node n checks its actual
degree $|\Pi_n|$ and the actual number of links in its neighborhood
$L_n$, i.e. the number of distinct links departing from the nodes in $\Pi_n \cup \{n\}$.
These values are compared with two values that n stores,
namely the target degree $|\Pi_n|_{target}$ and a target number
of links in n’s neighborhood $L_{n,target}$. By monitoring the
number of links in its neighborhood, n obtains an approximate
understanding of how the network is evolving. (These two
target values are periodically updated, based on the values
assumed in a time window.)

In case of a significant increase in the number of links in
some portion of the network, the nodes with the highest
variations in their degrees check if some links (i.e. those with
higher ECC values) can be removed. Indeed, if the difference
between the actual values and the target ones surpasses a given
threshold, then the node n invokes a procedure that removes its
r links with the highest ECC values (larger than a threshold value
$T_{ECC}$), if there are any. (In the simulations, we consider r = 1,
since it suffices to control the rate of the periodic check to
increase/decrease the number of links that can be removed.)
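
A possible shape of this link removal phase is sketched below. It rests on several assumptions: the check fires as soon as both actual values exceed their targets (the tolerance on the difference is not modeled), the removal is deterministic, and all names are hypothetical.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def ecc(g: Graph, n: str, m: str) -> float:
    """Edge clustering coefficient of (n, m), as defined in the text."""
    possible = min(len(g[n]) - 1, len(g[m]) - 1)
    return len(g[n] & g[m]) / possible if possible > 0 else 0.0


def neighborhood_links(g: Graph, n: str) -> int:
    """Distinct links departing from n or from one of its 1st neighbors."""
    nodes = g[n] | {n}
    return len({frozenset((u, v)) for u in nodes for v in g[u]})


def prune_links(g: Graph, n: str, target_degree: int, target_links: int,
                ecc_threshold: float, r: int = 1) -> List[str]:
    """Remove at most r links of n whose ECC exceeds ecc_threshold."""
    if len(g[n]) <= target_degree or neighborhood_links(g, n) <= target_links:
        return []  # no abnormal growth detected around n
    # ECC values are computed once, before any link is removed.
    ranked = sorted(((ecc(g, n, m), m) for m in g[n]), reverse=True)
    removed: List[str] = []
    for value, m in ranked:
        if len(removed) == r or value <= ecc_threshold:
            break
        g[n].discard(m)
        g[m].discard(n)
        removed.append(m)
    return removed


def clique4() -> Graph:
    """Hypothetical 4-node clique, where every edge has ECC equal to 1."""
    nodes = ["a", "b", "c", "d"]
    return {u: {v for v in nodes if v != u} for u in nodes}
```

Removing the links with the highest ECC first targets edges that are backed by many triangles, so the endpoints remain connected through their common neighbors.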

IV. Evaluation Assessment 

Simulation was used to assess the performance of the
proposed algorithms. In these simulations, we varied:

• the topology of the unstructured overlays over which the
approaches were executed. In particular, we employed
uniform networks, clustered networks and scale-free networks.
It is worth mentioning that simulations were also run
on classic random graphs, but we omit those results here,
since they are similar to those obtained for uniform
networks.
• the type of simulation. We simulated (i) the classic
scenario where the network evolves with an equal number
of joining and leaving nodes (i.e., equal join and
fail rate probabilities); (ii) a case similar to the previous
one, but where the nodes to be removed are those that
might have some important role in the network, i.e., we
performed two types of simulations where the removed nodes
were those with the highest degrees in one case, and those
with the highest betweenness in the other case; (iii) the case
when only failures occur.
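
The choice of the node to remove in the different simulation modes can be sketched as follows. The mode names and the function are illustrative assumptions, and the betweenness-based variant is left out because it requires a global shortest-path computation.

```python
import random
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node id -> set of 1st neighbors (undirected links)


def pick_failing_node(g: Graph, mode: str, rng: random.Random) -> str:
    """Select the next node to fail under the given (hypothetical) mode name."""
    nodes = sorted(g)  # sorted for reproducible choices
    if mode == "random":
        return rng.choice(nodes)  # every node is equally likely to fail
    if mode == "degree":
        return max(nodes, key=lambda node: len(g[node]))  # targeted attack
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical star overlay: the hub is the obvious target of an attack.
star: Graph = {
    "hub": {"x", "y", "z"},
    "x": {"hub"},
    "y": {"hub"},
    "z": {"hub"},
}
```

On the star, the degree-based mode always selects the hub, whose removal disconnects all the remaining nodes.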

The considered approaches are P2n, Pecc and “none”, which 
represents the (typical) situation when peers do not react to 
node disconnections, simply assuming that other links will be 
created upon arrivals of novel nodes. Details of the simulation 
are discussed in the next subsection. 

A. Simulation Details 

The simulator was a discrete event simulator, implemented 
using the GNU Octave language and the Octave-network- 
toolbox, a set of graph/networks analysis functions in Octave 
[1]. Based on it, we assume that the communication among 
peers is reliable, with a latency that is negligible, with respect 
to the inter-arrival times of overlay related events (e.g., node 
arrivals and departures), times required by the failure detector 
to identify a node failure, and so on. Hence, once a message is
sent from a node to another, the communication can be thought
of as instantaneous and completely reliable. This is a common
approximation that relieves the simulation from dealing with all
the underlying communication related issues, simply focusing on
the overlay parameters. Other P2P discrete event simulators
offer similar abstractions, e.g. PeerSim [36], P2PSim [24], 
PlanetSim [2], TUNES [13], SimGrid [9]. 

We present results averaged over a corpus of 20 simulations
for the same scenario. In each simulation, we started with
an overlay network with a specified degree distribution and
network characteristics, and let the simulation advance for a
number (~ 100) of simulation steps. All the configuration
parameters were varied; we present here results for some
particular configuration settings, since those obtained for
different ones were comparable to those shown.

Upon a node failure, all its links with other nodes are 
removed. Then, the node passes to an inactive state; it can 
be selected further on to simulate a novel node arrival. Thus, 
a node arrival is realized by changing the state of a randomly 
selected inactive node to pass to the active state. This event 
triggers the creation of novel links with other randomly 
selected nodes. Different joining procedures were executed, 
depending on the network topology under investigation. The 
idea was to adopt a join mechanism that would maintain the 
topology unaltered. 
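
The failure and arrival events described above can be sketched as follows. This is a minimal Python illustration, not the authors' Octave simulator: the adjacency-set representation, the function names `fail_node` and `join_node`, and the uniform choice of `k` random neighbors are assumptions made for the example (the actual joining procedure depended on the topology).

```python
import random


def fail_node(adj, active, node):
    """Remove every link of `node` and move it to the inactive state."""
    for neighbor in adj[node]:
        adj[neighbor].discard(node)
    adj[node] = set()
    active.discard(node)


def join_node(adj, active, k, rng):
    """Reactivate a random inactive node and link it to k random active nodes.

    Returns the reactivated node, or None if no inactive node exists.
    """
    inactive = sorted(set(adj) - active)
    if not inactive:
        return None
    node = rng.choice(inactive)
    # Hypothetical uniform join: pick up to k distinct active neighbors.
    targets = rng.sample(sorted(active), min(k, len(active)))
    for target in targets:
        adj[node].add(target)
        adj[target].add(node)
    active.add(node)
    return node
```

In the evolution mode, one call to `fail_node` would be paired with one call to `join_node` per step, so that the number of active nodes stays constant.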

Both protocols, P2n and Pecc, employ a threshold on the 
maximum degree. In Section V-F, we show the impact of 
varying this value; unless otherwise stated, the threshold 
was set to 100. The appropriate threshold strongly 
depends on the P2P system one wants to build, on the specific 
application run on top of the overlay, and on the typical 
number of connections a peer maintains during its lifetime 
in the network; it should be tuned accordingly. 
For instance, BitTorrent sets the maximum degree for peers 
to 80 (each peer then limits the number of simultaneously 
active connections using the choke algorithm) [47]. 
Gnutella has a degree distribution that follows a power law; 
a snapshot taken in 2000 revealed that nodes had a 
maximum degree of 136, a median of 2, and an 
average of 5.5 [47]. In PPLive, the average node degree varies 
in a small range, between 28 and 42, over the course of the day, 
with no correlation between the variation of the average degree 
and the channel size. The overlay resembles a random graph 
when the network is small (around 500 nodes) but becomes more 
clustered as the network grows [50]. For these reasons, a specific 
static value is not proposed in this work; however, results will 
show that changing the threshold on the maximum degree 
can significantly lower the number of 1st and 2nd neighbors, 
without evident differences in the size of the main component. 

B. Network Topologies 

As already mentioned, we employed three different kinds 
of overlay topologies, varying their specific parameters. In the 
following, the general characteristics of these topologies are 
described, together with the method employed to simulate the 
arrival of a novel node in the network, which is designed 
to respect the typical attachment process of that topology. 

Node removals are discussed in a dedicated subsection 
later in this section. 

1) Uniform Networks: Uniform networks are those where 
all nodes start with the same degree. Then, due to node 
failures and arrivals (and the reconfiguration imposed by the 
P2P protocol), the node degree might change. We varied the 
initial degree of nodes. Uniform networks are quite common 
in several (P2P) systems, where the software running on peers 
is configured to have a given number of links in the overlay. 
This is usually accomplished for load balancing purposes [31]. 

As concerns the arrival of a novel node, a random set of 
neighbors was selected, whose size was equal to the initial 
degree parameter. Of course, this increases the degree of the 
nodes that accept such a novel link. However, it does 
not alter the general idea of a network topology where all 
nodes have the same importance (uniform). 

2) Clustered Networks: The presented self-healing protocols 
are designed for those P2P overlays that have important 
links connecting different parts of the network; thus, it is 
interesting to observe how the protocols perform over networks 
composed of different connected clusters. In these simulations, 
all network clusters had the same size. 

We set two different parameters to create the network. The 
first parameter is the probability γ of creating a link among 
nodes of the same cluster. Each node is linked to another 
node of the same cluster with probability γ; hence, inside 
a cluster, nodes are organized as a classic random graph. As 
to inter-cluster links, the number of links created between 
two clusters was determined by a probability ω 
times the number of nodes in the clusters (i.e., each node has 
a probability ω of having a link with each external cluster). 

Upon a node arrival, the node was assigned to a cluster and 
links with nodes in that cluster were randomly created based 
on the probability γ, as in a classic random graph. Then, for 
each other cluster, the node creates, with probability ω, a link 
with a random node of that cluster. 
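
The construction of a clustered overlay can be sketched as below. This is an illustrative Python reading of the description above, assuming equal-sized clusters with consecutive node ids; the names `build_clustered`, `gamma` (intra-cluster link probability) and `omega` (per-node probability of a link towards each external cluster) are chosen for the example and are not taken from the authors' code.

```python
import random


def build_clustered(n_clusters, size, gamma, omega, rng):
    """Return (adj, cluster_of) for a network of equal-sized clusters."""
    nodes = range(n_clusters * size)
    cluster_of = {v: v // size for v in nodes}
    adj = {v: set() for v in nodes}

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for u in nodes:
        # Intra-cluster: classic random graph, each pair linked with prob. gamma.
        for v in range(u + 1, (cluster_of[u] + 1) * size):
            if rng.random() < gamma:
                link(u, v)
        # Inter-cluster: with prob. omega, link u to a random node of each
        # other cluster.
        for c in range(n_clusters):
            if c != cluster_of[u] and rng.random() < omega:
                link(u, rng.randrange(c * size, (c + 1) * size))
    return adj, cluster_of
```

The same two loops, applied to a single arriving node, give the join procedure described in the last paragraph.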

3) Scale-Free Networks: A scale-free network possesses 
the distinctive feature of having a degree distribution 
that can be well approximated by a power law. 
Hence, the majority of nodes have a relatively low number 
of neighbors, while a non-negligible percentage of nodes 
(“hubs”) have higher degrees [12]. The presence of hubs 
has an important impact on the connectivity of the network. In 
fact, a peculiarity of these networks is that they have a 
very small diameter, which allows information to propagate 
in a low number of hops. To build scale-free networks, our 
simulator implements the construction method proposed in [3]; 
we also used a classic preferential attachment generation 
approach, through a specific routine available in the 
Octave-network-toolbox [1], [37]. 

Upon a node arrival, preferential attachment was used 
for scale-free networks. That is, the higher the degree of a 
node, the more likely it is to receive new links. Thus, the more 
connected nodes are more likely to obtain the novel links 
added to the network. This is the typical approach that leads 
to the formation of scale-free networks [12], [37]. 
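
A degree-proportional join of this kind can be sketched in Python as follows. It is only an illustration of preferential attachment in general, not the Octave-network-toolbox routine used in the paper; the function name `preferential_join` and the parameter `m` (number of links created by the arriving node) are assumptions.

```python
import random


def preferential_join(adj, new_node, m, rng):
    """Link new_node to up to m distinct nodes, chosen with probability
    proportional to their current degree. Returns the chosen neighbors."""
    # Nodes with degree zero have zero weight and are never selected.
    candidates = [v for v in adj if v != new_node and adj[v]]
    chosen = set()
    while len(chosen) < min(m, len(candidates)):
        pool = [v for v in candidates if v not in chosen]
        weights = [len(adj[v]) for v in pool]
        chosen.add(rng.choices(pool, weights=weights, k=1)[0])
    adj.setdefault(new_node, set())
    for v in chosen:
        adj[new_node].add(v)
        adj[v].add(new_node)
    return chosen
```

On a star graph, for instance, the hub is selected far more often than any single leaf, which is the "rich get richer" effect the paragraph describes.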

Unless otherwise stated, we employ four different scale-free 
networks with different characteristics. The first 
two networks are composed of a small number of nodes 
(following a power-law degree distribution) and are 
disconnected. The other two networks, instead, 
have a main component, with important 
hubs that provide this connectivity. 

C. Simulation Scenarios 

We evaluated the presented approaches using different 
simulation modes, which differ in the way nodes were 
selected for removal from the overlay and in whether, during the 
simulation, novel nodes were allowed to enter the network. 

1) Evolution: The first mode was based on a random 
selection of failed nodes, with an amount of failed nodes equal 
to the amount of joining nodes. This way, the network size 
remains stable during the simulation. 

2) Targeted Attack to Nodes with Highest Degree: In this 
case, at each step of the simulation the “important” nodes, 
i.e., those with some specific characteristics, were selected 
to fail. In particular, for uniform and scale-free networks, 
the nodes with the highest degrees were selected to fail. In 
clustered networks, instead, the selected nodes were those with 
the highest number of links connecting different clusters (i.e., 
the highest inter-cluster degree); the rationale was to increase 
the probability of disconnecting the clusters. In this scheme, 
as in the previous simulation mode, the number of failed nodes 
per simulation time interval was kept equal to the number of 
joining nodes. 
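
The two victim-selection rules can be written compactly. The sketch below is a hypothetical Python rendering (names `highest_degree_node`, `highest_intercluster_node` and the `cluster_of` mapping are assumptions); ties are broken by the smallest node id, a choice the paper does not specify.

```python
def highest_degree_node(adj, active):
    """Active node with the largest number of neighbors."""
    return max(sorted(active), key=lambda v: len(adj[v]))


def highest_intercluster_node(adj, active, cluster_of):
    """Active node with the most links towards nodes of other clusters."""
    def inter_cluster_degree(v):
        return sum(1 for u in adj[v] if cluster_of[u] != cluster_of[v])

    return max(sorted(active), key=inter_cluster_degree)
```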

3) Targeted Attack to Nodes with Highest Betweenness: 
This simulation type is similar to the targeted attack to nodes 
with highest degree. However, instead of selecting the node 
with highest degree (or highest inter-cluster degree in the case 
of clustered networks), the simulator selected as the node to fail 
the one with highest node betweenness. 


Betweenness is a centrality measure that, given a node in 
a network, calculates the number of shortest paths from all 
nodes to all others which pass through that node. Thus, if a 
node n has a high betweenness, it means that several paths in 
the overlay pass through n. Or, in other words, if you plan to 
go from a node to another in an overlay, it is quite probable 
that you will encounter n during your path. Nodes may have 
a low node degree but high betweenness.^ 

The formula for measuring the betweenness of a node 
n is as follows. Let $\sigma_{mp}$ denote the number of shortest 
paths between two nodes m, p, and let $\sigma_{mp}(n)$ denote 
the number of shortest paths between m, p passing 
through n. Then, the betweenness of n is measured as the 
fraction between the number of shortest paths passing through 
n and the number of shortest paths in the network, i.e., 

$$\mathrm{bet}(n) = \sum_{m \neq n \neq p} \frac{\sigma_{mp}(n)}{\sigma_{mp}}$$ 

It should be clear that the removal of a node with high 
betweenness centrality can lead to an increment of the path 
lengths and to network disconnections. Thus, this targeted 
attack is of main interest in our study. 
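
To make the measure concrete, the following Python sketch computes the betweenness of a single node by brute force, counting shortest paths with breadth-first search. It assumes an undirected, unweighted overlay and sums over unordered node pairs; the function names are illustrative, and the approach is a textbook one rather than the routine used in the simulator.

```python
from collections import deque


def bfs_counts(adj, src):
    """Return (dist, sigma): hop distance and number of shortest paths from src."""
    dist = {src: 0}
    sigma = {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, n):
    """Sum over pairs {m, p} of the fraction of shortest m-p paths through n."""
    info = {v: bfs_counts(adj, v) for v in adj}
    dist_n, sigma_n = info[n]
    nodes = sorted(adj)
    total = 0.0
    for i, m in enumerate(nodes):
        if m == n:
            continue
        dist_m, sigma_m = info[m]
        for p in nodes[i + 1:]:
            if p == n or p not in dist_m or n not in dist_m or p not in dist_n:
                continue
            # n lies on a shortest m-p path iff the distances add up.
            if dist_m[n] + dist_n[p] == dist_m[p]:
                total += sigma_m[n] * sigma_n[p] / sigma_m[p]
    return total
```

For two triangles {0, 1, 2} and {3, 4, 5} joined only through a node 6 linked to nodes 2 and 3, node 6 has degree 2 but lies on every path between the two triangles, so it has the highest betweenness in the graph.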

4) Failure Churn: In this case, only failures occurred 
during the simulation. Thus, each network started with all nodes 
active, which were (randomly) forced to fail until no active 
nodes remained in the network. This allows us to understand whether 
the self-healing protocols are able to react to situations with high 
failure rates. We refer to this simulation mode as “failures 
only”. 
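
A "failures only" run for the “none” baseline (no reaction to failures) can be sketched as below, together with the main-component measurement reported in the results. This is a minimal Python illustration under assumed names (`main_component_size`, `failures_only`); the self-healing protocols are deliberately left out.

```python
import random
from collections import deque


def main_component_size(adj, active):
    """Size of the largest connected component among active nodes."""
    seen = set()
    best = 0
    for start in active:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def failures_only(adj, rng):
    """Fail random nodes until none is active; no healing is performed.

    Returns a list of (active_nodes, main_component_size) after each failure.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    active = set(adj)
    trace = []
    while active:
        victim = rng.choice(sorted(active))
        for neighbor in adj[victim]:
            adj[neighbor].discard(victim)
        adj[victim] = set()
        active.discard(victim)
        trace.append((len(active), main_component_size(adj, active)))
    return trace
```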

V. Results 

This section discusses the results obtained in the 
simulation scenarios described above. A first point worth 
mentioning is that the considered approaches do not increase the 
connectivity of the network overlay being utilized. In fact, P2n 
and Pecc restore connections with lost 2nd neighbors, 
without looking for novel nodes. Thus, the obtained connectivity 
is at most equal to the initial one (we will see that these two 
approaches are able to maintain it, while the “none” approach 
is not). 

Another result is that the two approaches increase, in some 
cases, the average number of 2nd neighbors in the network. 
This happens especially upon removal of a node n that is 
important in terms of connectivity. In this case, the remaining 
nodes have to reorganize their connections. This might lead 
to the creation of multiple links (instead of a single link) 
to connect to local clusters previously reached through n. 
While the average number of 1st neighbors is not particularly 
affected by the replacement of a single link with multiple ones, 
this multiplicative factor is more evident when counting the 
number of novel 2nd neighbors (especially when the clustering 
coefficient is low). 

Although link reduction was mentioned in the description of Pecc, 
it was not activated in these experiments. The idea was 
to understand whether that protocol is able to guarantee network 
connectivity. Thus, one should keep in mind that when the 
number of added links becomes too high (a judgment 
that depends on the specific application requirements), one 
can reduce it by removing unnecessary ones. 

^E.g., imagine two separate clusters in a network and a single 
node n acting as an intermediary, linked to a single node in 
each cluster. In this example, n has a low degree (equal to 2) but a high 
betweenness value, since all paths between two nodes in different clusters 
have to pass through n. 

A. Evolution 

This is the simulation scenario where nodes enter and leave 
the network at the same rate. Leaving nodes are selected 
at random. Moreover, entering nodes respect the type 
of attachment related to the overlay topology: for 
uniform networks, neighbors are selected at random; for clustered 
networks, nodes are randomly assigned to a cluster and neighbors 
are randomly selected in that cluster (then, some links might 
be created among different clusters with a lower probability, 
as previously discussed); for scale-free networks, preferential 
attachment is performed. Thus, we do not expect failures 
to introduce relevant connectivity problems, and the use of P2n and 
Pecc might not be necessary in this case. In any case, we 
thought it would be interesting to understand how these 
self-healing protocols perform. 

1) Uniform Networks: Figure 2 shows results for uniform 
networks. The top chart reports the average size of the main 
component for the three considered management protocols, 
while the other charts report the average amount of 1st 
neighbors (bottom, left) and the amount of 2nd neighbors 
(bottom, right). 

As expected, the failure of nodes does not create particular 
problems, since others arrive in the meantime. Thus, the 
topology remains largely unchanged. It is interesting to 
observe, however, that when the number of links is low, a 
small portion of the nodes can remain outside the 
main component when no failure management mechanism is 
employed (see the “none” curve on the left chart). 

Another interesting aspect is that, while only small variations 
in the average number of 1st neighbors are noticed among the 
three schemes (the “none” protocol has a slightly lower average 
value than the other two approaches), the average number of 
2nd neighbors is significantly lower for the “none” protocol 
w.r.t. P2n and Pecc. In particular, with respect to the initial 
value, this measure decreases, on average, for “none”, while 
it increases with P2n and Pecc. This increment was expected. 
We are running the protocols in the evolution mode, thus 
nodes leave and enter the overlay at the same pace. When 
entering the network, novel nodes create their initial 
number of links by randomly selecting their neighbors. Hence, 
the general network topology remains unchanged during the 
evolution. 

The two self-healing protocols are local. They are 
designed to prevent a node from losing connections with 
nodes in its 2nd-neighborhood. When we add this kind of 
approach to a network that evolves in a stable manner (on 
average), the number of links in the network will increase. 
Depending on the application requirements, whenever this 
property is undesired, one might couple the protocol with 
the mentioned link reduction process, or employ a low 
threshold on the maximum degree. Indeed, we will see in 
Section V-F that changing the threshold on the maximum 
degree can significantly lower the number of 1st and 2nd 
neighbors, without evident differences in the size of the main 
component. 

2) Clustered Networks: With clustered networks, too, 
a random removal of nodes (“evolution” simulation mode) 
does not significantly alter the 
topology; hence, as concerns the main component size, no 
particular benefits are evident from the use of P2n and Pecc 
w.r.t. “none” (see Figure 3). 

An interesting result is that with the “none” protocol a lower 
average node degree is measured, while higher values are 
obtained with P2n and Pecc. In particular, Pecc provides 
values that are closer to the initial ones. As for uniform 
networks, the number of 2nd neighbors increases with P2n 
and Pecc. 

3) Scale-Free Networks: Under the evolution simulation mode, 
no noticeable differences are evident for scale-free networks 
(see Figure 4). Note that the first two considered 
scale-free networks are highly disconnected. Hence, even 
if the degree distribution follows a power law, there are no 
real hubs that connect all subnetworks. 

B. Targeted Attack to Node with Highest Degree 

1) Uniform Networks: When considering the targeted attack 
simulation mode with uniform networks, results are not 
that different from those obtained in the evolution mode. Indeed, 
there are no important differences between nodes: all 
start with the same initial degree, links are established 
arbitrarily during the network evolution, and there are no important 
hubs in the network. Thus, the selection of the node with 
highest degree does not have a significant impact on the topology 
(see Figure 5). Nevertheless, it is possible to appreciate that 
the numbers of 1st and 2nd neighbors decrease for the “none” 
protocol w.r.t. the results obtained in the evolution simulation 
mode. Similarly, with the “none” protocol the average size of 
the main component is lower than that obtained in the 
evolution simulations. 

Conversely, as concerns the average main component size, 
results remain unchanged for P2n and Pecc. Instead, the 
numbers of 1st and 2nd neighbors increase. This can be 
explained as follows. Uniform networks are quite similar to 
random graphs, as links are established arbitrarily. Hence, 
clustering is low. Consider a node n; upon failure 
of one of its neighbors, say node f, due to the network 
topology it is unlikely that n has as 1st neighbors the nodes 
that were connected to f. Thus, P2n and Pecc will force n 
to create novel links with f’s neighbors. This is even more 
evident if we select the nodes with highest degree to fail. 
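
The effect described in this paragraph can be reproduced with a deliberately simplified healing step. The Python sketch below is a loose reading of the behavior stated here (neighbors of a failed node re-link to the nodes they can no longer reach within two hops, subject to a maximum-degree threshold); it is not the P2n or Pecc protocol as specified by the authors, and the names `second_neighbors`, `heal_after_failure` and `max_degree` are assumptions.

```python
def second_neighbors(adj, n):
    """Nodes at exactly two hops from n."""
    return {q for v in adj[n] for q in adj[v]} - adj[n] - {n}


def heal_after_failure(adj, failed, max_degree):
    """Remove `failed`, then let its former neighbors restore lost reachability.

    A link n-q is created only if q is no longer a 1st or 2nd neighbor of n
    and both endpoints are below max_degree. Returns the list of added links.
    """
    former = sorted(adj[failed])
    for v in former:
        adj[v].discard(failed)
    adj[failed] = set()
    added = []
    for n in former:
        for q in former:
            if q == n or q in adj[n] or q in second_neighbors(adj, n):
                continue
            if len(adj[n]) < max_degree and len(adj[q]) < max_degree:
                adj[n].add(q)
                adj[q].add(n)
                added.append((n, q))
    return added
```

On a star whose hub fails, a generous threshold lets one former leaf absorb all the new links, whereas a threshold of 2 forces the leaves to share them, which mirrors the role of the threshold discussed in Section V-F.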

As previously stated, the approach to adopt in order to 
cope with this possible issue is application dependent. If the 
increment mentioned above is undesirable, one might employ a 
link reduction process, add a limit on the maximum degree 
when creating links or, more drastically, turn off the 
self-healing protocols. As we will see in Section V-F, in some 
cases the introduction of a lower threshold on the maximum 
node degree does not alter the connectivity provided by P2n 
and Pecc. 

Fig. 2. Uniform networks - evolution simulation mode. 


2) Clustered Networks: Clustered networks are particularly 
affected by the targeted removal of nodes with highest 
inter-cluster degrees. Indeed, with the “none” protocol the average 
size of the main component is greatly reduced, while the 
two self-healing protocols P2n and Pecc maintain a high (full) 
connectivity, as shown in Figure 6. This confirms the 
usefulness of the proposed protocols in these situations. 
The increment in the average number of 1st neighbors is 
limited, with Pecc providing a slightly lower increment 
than P2n. Conversely, the use of the self-healing 
protocols causes an increment in the number of 2nd neighbors 
(again, Pecc has a lower increment w.r.t. P2n). This is 
explained by the fact that, in the clustered topology, 
only a limited number of nodes have links with nodes in other 
clusters. Without the self-healing protocols, these clusters 
become disconnected. With the self-healing protocols, instead, 
the neighbors of the failed node share the task of replacing 
these inter-cluster connections. Thus, it is likely that multiple 
nodes create links towards other clusters (and also some links 
within the cluster). 

3) Scale-Free Networks: The two protocols work well even 
for scale-free networks under the targeted attack. Figure 7 
shows that P2n and Pecc guarantee high connectivity, at 
the cost of a small increment in the average degree. But 
again, this is expected, since while hubs fail, other 
nodes enter the network at the same rate. Conversely, the 
connectivity level decreases without the use of a self-healing 
protocol (i.e., the “none” protocol). This is a well-known result in 
the literature, as it has already been recognized that scale-free 
networks are not resilient to targeted attacks [37]. 

C. Failure Churn 

As mentioned, this is the scenario where nodes progressively 
fail. It is an interesting experiment to assess whether 
the protocols are able to cope with extreme churn. 

1) Uniform Networks: Figure 8 shows results obtained under 
the “failures only” simulations with uniform networks. In 
particular, the chart on the left shows the number of nodes that 
remain in the main component while nodes continuously fail. 
(We repeated the same experiment multiple times, varying the 
network size, the initial node degree, and the seed for random 
number generation, obtaining comparable results.) It is possible to see 
that, with the “none” protocol, at a certain point of the simulation 
the network gets disconnected and the percentage of active 
nodes in the main component decreases. With P2n and 
Pecc, instead, active nodes remain connected in the same, single 
component. This is confirmed by the chart on the 
right in the same figure, which shows the number of isolated 
nodes. While the percentage of isolated nodes increases in the 
“none” scheme, no nodes remain isolated with the other two 
protocols. 

Fig. 3. Clustered networks - evolution simulation mode. 


2) Clustered Networks: Figure 9 shows results obtained 
with clustered networks in the “failures only” simulation 
mode. As for uniform networks, the chart on the left shows 
the number of nodes that remain in the main component 
during the evolution, while nodes continuously fail. In this 
case, the network was disconnected, in the sense that the main 
component comprised only a fraction (slightly over 25%) 
of the whole set of nodes. We can see that in this case 
the main component size remains almost stable, for all 
three protocols, until half of the nodes become disconnected. 
This is because the random choice of the failing 
nodes tends to pick nodes outside the 
largest component (which includes less than 30% of the nodes). 
However, in the last part of the simulation run, the “none” 
protocol experiences a progressive decrease of nodes in the 
main component, since the main component is partitioned by 
the failures of its nodes. Conversely, the size of the main 
component increases for P2n and Pecc. This is explained by 
the presence of the failure management protocols, which prevent 
the partition of the components. The chart on the right of the 
figure confirms this, by reporting the number of isolated nodes. 
While this number progressively increases with the “none” 
protocol, with P2n and Pecc the percentage of isolated nodes 
remains negligible for most of the simulation. Only 
at the end of the simulation does a non-negligible number of 
isolated nodes appear. This is explained by the fact that, after 
a while, some (minor) components are reduced to a 
single node (all their other nodes having already failed). 

3) Scale-Free Networks: Figure 10 reports results for a 
“failures only” simulation mode, run on a scale-free network 
composed of 636 nodes, with a maximum degree of 20 (for 
those interested in the specific construction method [3], it 
employs two parameters that in this case were set to a = 6, 
b = 2). By looking at the chart on the left, it is possible to see 
that the simulation starts with a main component composed 
of more than 70% of the nodes. In the “none” mode, 
the main component progressively loses all its nodes, while with 
the P2n and Pecc protocols the main component maintains 
its size (which actually increases in percentage, upon failure 
of nodes outside the main component). In this case, 
Pecc outperforms P2n. This is confirmed by the chart on the 
right in the figure, which reports the number of isolated nodes. 

D. Targeted Attack to Nodes with Highest Betweenness 

It is generally accepted that in many networks the larger the 
degree the larger the betweenness [29]. The idea is that the 
higher the degree of a node the higher the probability that a 
path might pass through it. However, as previously stated this 
depends on the network topology. 

As concerns scale-free networks, for instance, it has been 
noticed that, unless the network has been built with a high 
level of disassortativity (i.e., high repulsion between hubs), in 
general there is a high correlation between the degree of a 
node and its betweenness centrality [33]. In this case, results 
obtained with a targeted attack to nodes with highest degree 
are comparable to those obtained with targeted attacks to nodes 
with highest betweenness. 

Fig. 4. Scale-free networks - evolution simulation mode. 

When considering clustered networks, instead, nodes with 
higher betweenness might be those that are connected with 
different clusters. Thus it is important to study this kind of 
attack when dealing with clustered networks. For this reason 
and for the sake of conciseness, we focus here on clustered 
networks only. 

1) Clustered Networks: In this case, the discrepancy 
between the two self-healing protocols P2n, Pecc and “none” 
is even more evident than in other cases. In particular, the 
connectivity provided by “none” is significantly lower than that of 
the other two approaches (see Figure 11). The average numbers 
of 1st and 2nd neighbors decrease with “none” with respect to 
the original topologies, while these values increase with P2n 
and Pecc. However, the increment with Pecc is lower than 
with P2n. 

E. Variation of the Node Degree 

In order to assess how the node degree is altered by the 
use/non-use of the self-healing protocols, we report in this 
subsection how the nodes’ degree changes, on average. 


We consider only those nodes that experience a degree 
variation during the simulation. Hence, this is not an average 
over all nodes (the average variation of the node degree over the 
whole peer set is considerably lower). However, this measure 
gives an idea of local alterations in the networks. For the 
sake of conciseness, we consider the targeted attack simulation 
mode only. 
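
The measure just described can be stated precisely with a few lines of Python. The sketch assumes that degrees are recorded before and after a run as dictionaries keyed by node id; the function name `mean_abs_degree_variation` is illustrative.

```python
def mean_abs_degree_variation(before, after):
    """Average |degree change| over the nodes whose degree actually changed.

    Nodes with an unchanged degree are excluded, so this is not an average
    over the whole peer set.
    """
    changes = [
        abs(after[v] - before[v])
        for v in before
        if v in after and after[v] != before[v]
    ]
    return sum(changes) / len(changes) if changes else 0.0
```

For example, with degrees {a: 3, b: 2, c: 5} before and {a: 5, b: 2, c: 4} after, only a and c are counted and the result is 1.5.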

Figure 12 shows the variations of node degrees, in absolute value, 
for different configurations of the three considered types of 
network topologies. It is possible to notice that, as expected, 
since the network evolves, the node degree varies, and this 
is more evident with the use of the self-healing protocols. 
It also seems that Pecc has slightly lower variations than 
P2n. 

F. Impact of the Threshold on the Maximum Node Degree 

We mentioned that the two self-healing protocols P2n and 
Pecc employ a threshold on the maximum degree a node 
might have. In the previous subsections, this parameter was 
set to 100, which might be high (considering the sizes 
of the employed networks) but is a quite reasonable value for P2P 
systems. In this section, we study the impact of this threshold. 
In fact, this parameter can be tuned to obtain a good trade-off 
between the ability of the protocol to guarantee connectivity 
and a limit on the variation of the node degrees. 

In this case, for the sake of conciseness we will focus on 
scale-free networks under targeted attack only. The choice of 
this topology is due to the presence of hubs that have a node 
degree much higher than the majority of other nodes. When 
this network is under a targeted attack, hubs are removed, 
causing network partitions. Thus, the self-healing approaches 
become very important in this case. 

Figures 13 and 14 show, for P2n and Pecc, respectively, 
the differences between using a threshold on the maximum node 
degree equal to 20 and to 100. Note that a low threshold 
value, such as 20, means that upon failure of a 
hub no single node will be able to take over its role, since the 
number of novel connections a node can create is limited. Thus, 
the novel links created to maintain network connectivity must 
be shared among nodes. It is possible to notice that, while 
the average numbers of 1st and 2nd neighbors decrease with 
a lower threshold, the connectivity of the network remains 
almost unchanged. This is a very important result, confirming 
that tuning the parameters of P2n and Pecc, depending 
on the topology in use, can guarantee the effectiveness of the 
self-healing protocols without significantly altering the nodes’ 
workload. 

Figure 15 shows how the variation of the node degrees 
changes when the threshold is modified. It is possible 
to observe an important reduction of this variation with a lower 
threshold. 

To conclude this discussion, it is worth mentioning that 
tuning this threshold parameter is not the sole option to 
control the growth of the node degrees in the presence of churn. 
The self-healing protocols can be coupled with a link reduction 
approach that removes redundant links (e.g., those with 
high ECC). It is important to notice that this would alter the 
clustering of the overlay. Another option is to avoid the 
use of a fixed threshold on the node degrees, and rather 
set the threshold based on the variation of the actual degree 
of a node w.r.t. its initial/target degree. The idea is that the 
fluctuations of a node’s degree should not exceed some limit. 
However, this might be a problem in certain topologies. For 
instance, in a scale-free network, the failure of a 
hub means that several links are removed from the network. If 
the remaining nodes want to maintain network connectivity, they 
need to replace these lost links in some way, and this would 
likely result (in some cases) in an increment of node degrees. 
Such a limit on degree variation could conflict with 
this need. 

G. On the Clustering Coefficient and Network Diameter 

In this subsection, we look at the influence of P2n and 
Pecc on the network clustering coefficient and on the network 
diameter. The idea was to analyze the resulting networks when 
the self-healing protocols are executed on an evolving P2P 
system. 

As to the clustering coefficient, previous works assert that 
it is undesirable for an unstructured P2P overlay to have high 
clustering [49]. In fact, clustering reduces the connectivity of 
a cluster to the rest of the network, increases the probability of 
partitioning, and may cause redundant message delivery. 

As to the network diameter, it is evident that the lower the 
diameter, the faster the message dissemination in the overlay. 
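
For readers who wish to reproduce these two metrics, a small Python sketch follows. It assumes the average local clustering coefficient (nodes with fewer than two neighbors contribute zero) and measures the diameter as the longest shortest path between mutually reachable nodes; the paper does not state which variants were used, so both choices are assumptions.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average over all nodes of the fraction of linked neighbor pairs."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k >= 2:
            closed = sum(
                1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a]
            )
            total += 2.0 * closed / (k * (k - 1))
    return total / len(adj)


def diameter(adj):
    """Longest shortest-path length (in hops) between reachable node pairs."""
    longest = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        longest = max(longest, max(dist.values()))
    return longest
```

A triangle has clustering coefficient 1 and diameter 1, while a four-node path has clustering coefficient 0 and diameter 3.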

1) Uniform Networks: Figure 16 shows how the clustering 
coefficient and the diameter change in a typical uniform 
network when the evolution simulation mode is employed 
(the test was repeated multiple times with different networks, 
obtaining the same qualitative results). It is possible to observe 
that the use of P2n and Pecc lowers the clustering coefficient 
as the uniform network evolves. Moreover, Pecc yields a larger 
decrement. Conversely, as expected, the “none” protocol maintains 
a stable clustering coefficient, since the network evolves as a 
typical unstructured uniform network, i.e., nodes enter and 
randomly select a fixed number of novel neighbors. 

As shown in the figure, with the “none” approach 
the network diameter increases, 
while P2n and Pecc allow maintaining a constant diameter. 

This confirms the viability of the two proposals for the 
support of P2P overlays. Moreover, Pecc allows differentiating 
the links created by neighbor nodes. 

2) Clustered Networks: Figure 17 shows the variation of 
the clustering coefficient and diameter during an example 
evolution of the simulation with a clustered network. In this 
case, the decrement of the clustering coefficient is significant 
for Pecc, while P2n has a minor impact on this metric. The 
diameter of the network decreases with both protocols. One 
can therefore choose between Pecc and P2n depending on 
whether such a reduction of the clustering coefficient is a desired 
effect (as commonly stated [49]) or not. 

3) Scale-Free Networks: The impact noticed for other
networks is not evident in scale-free networks, under the evolution
simulation mode. In fact, in this case the hubs do maintain their
main role in the network. The scale-free networks were
generated using a classic preferential attachment approach, using
a specific routine available in the Octave-network-toolbox [1],
[37]. Figure 18 shows the clustering coefficient and diameter
variations that, in this case, are negligible. Different scale-free
networks with varying network sizes were considered; results
showed the same trend in all cases.
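
As an illustration of what a preferential attachment generator does,
the following sketch grows a network in which each new node links to m
existing nodes chosen with probability proportional to their degree.
It is not the Octave-network-toolbox routine used for the experiments;
the function name, its parameters and the choice of a small complete
graph as the initial network are assumptions made for this example.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow an n-node network; each new node attaches to m existing nodes."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # 'stubs' lists every node once per incident link, so a uniform draw
    # from it selects a node with probability proportional to its degree.
    stubs = []
    # Assumed initial network: a complete graph over the first m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            stubs += [u, v]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # draw m distinct targets
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj
```

Early nodes tend to accumulate links and become hubs, which is the
property that makes these networks robust to the evolution mode and
sensitive to targeted attacks.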

VI. Conclusions 

This paper focused on two distributed mechanisms that can 
be executed locally by peers in an unstructured P2P overlay, in 
order to cope with node failures and augment the resilience of 
the network. The two self-healing protocols require knowledge 
of 1st and 2nd neighbors. Outcomes confirm that it is possible 
to augment resilience and avoid disconnections in unstructured 
P2P overlay networks. 

Fig. 7. Scale-free networks - targeted attack simulation mode.

In particular, while both schemes help to avoid network
disconnections, our results suggest that the use of the Edge
Clustering Coefficient (ECC) provides some additional
advantages during the self-healing phase. In fact, ECC gives an
idea of how much a link connects different communities. It can
thus be exploited to: i) replace lost important links with novel
ones after some failures, ii) (if needed) remove those (novel)
links that might excessively augment the degree of some node and
the number of triangles it belongs to, iii) reduce the clustering
coefficient of the overlay (depending on its topology).
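
To make the role of the ECC concrete, the sketch below computes it for
a single link. It assumes one common definition from the community
detection literature, i.e., the number of triangles containing the link
plus one, divided by the maximum number of triangles the link could
belong to; the variant adopted by Pecc should be checked against the
protocol description. The dictionary-of-sets representation and the
function name are assumptions as well.

```python
import math


def edge_clustering_coefficient(adj, u, v):
    """ECC of link (u, v): (triangles + 1) / min(deg(u) - 1, deg(v) - 1)."""
    if v not in adj[u]:
        raise ValueError("u and v are not linked")
    triangles = len(adj[u] & adj[v])  # common neighbors close a triangle
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if possible == 0:
        return math.inf  # an endpoint has no other neighbor
    return (triangles + 1) / possible
```

With two triangles joined by a single link, the joining link obtains
the lowest value, which is how a node can recognize, using only the
knowledge of its 2nd neighborhood, that a link connects different
clusters.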

The two self-healing protocols are local. They prevent a node
from losing connections in its 2nd-neighborhood. When we
employ them in a network that evolves in a stable manner (on
average), the number of links in the network increases, as
noticed in the evaluation assessment. However, it is possible to
limit this increment while maintaining network connectivity.
Thus, depending on the application requirements, whenever
such an increment is undesired, it is possible to couple the
self-healing protocols with a link reduction process, or to set
a low threshold on the maximum degree.

The employed system model assumes that only nodes can
fail; hence there are no single-link removals. This
simplification does not introduce important limitations, since the
protocol can be easily upgraded (without any substantial
modifications) to handle single link failures.

Moreover, the model assumes that network changes (in a
given neighborhood) are slower than the execution of a step
of the self-healing protocols. This is a common assumption
that enables nodes to self-repair network partitions through
local interactions only. However, we do not consider scenarios
in which a network is partitioned by the simultaneous failure
of a set of nodes, so that nodes in the remaining components
have no information about the other components (i.e., given two
nodes in different components after the churn, the distance
between these two nodes before the churn was higher than
2). This prevents the creation of novel links to repair the
partition. This is an uncommon situation that can be faced in
different ways. Increasing the local knowledge at peers would
help. For example, peers could store in their caches a
subset of their k-th neighbors, so that the number of node
entries at distance k is inversely proportional to k (this
avoids an exponential increase of the global amount of stored
data). This approach, coupled with a gossip protocol, might
help to find novel connections that would repair such kinds of
partitions.
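
The caching idea can be sketched as follows. This is a simulation-side
illustration only, under several assumptions that are not part of the
proposal: the whole overlay is available as a dictionary of neighbor
sets (a real peer would instead fill its cache through gossip), at most
budget // k entries (and at least one) are kept for distance k, and the
entries are sampled uniformly at random. All names are hypothetical.

```python
import random
from collections import deque


def build_neighbor_cache(adj, peer, k_max, budget, seed=None):
    """Keep about budget / k node ids for each distance k = 1..k_max."""
    rng = random.Random(seed)
    dist = {peer: 0}
    by_distance = {}  # distance k -> nodes found at exactly k hops
    queue = deque([peer])
    while queue:
        u = queue.popleft()
        if dist[u] == k_max:
            continue  # do not explore beyond k_max hops
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                by_distance.setdefault(dist[w], []).append(w)
                queue.append(w)
    cache = {}
    for k, nodes in by_distance.items():
        quota = max(1, budget // k)  # entries inversely proportional to k
        if len(nodes) <= quota:
            cache[k] = set(nodes)
        else:
            cache[k] = set(rng.sample(nodes, quota))
    return cache
```

The quota keeps the cache size bounded by roughly budget times the
harmonic number of k_max, instead of growing with the size of the
k-th neighborhood.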
Fig. 13. Scale-free networks, targeted attack - impact of the threshold on maximum node degree with P2n.
Fig. 16. Uniform networks - clustering coefficient and diameter of a typical network during the evolution simulation mode.
Fig. 17. Clustered networks - clustering coefficient and diameter of a typical network during the evolution simulation mode.
Fig. 18. Scale-free networks - clustering coefficient and diameter of a typical network during the evolution simulation mode.